Increase run DAISIE ml short jobs time #35

Merged
merged 1 commit into from
Dec 11, 2023

Conversation

joshwlambert
Collaborator

This PR increases the job time allocation for short DAISIE ML jobs from 2 hours to 5 hours. This is to ensure that large datasets or complex DAISIE models can finish optimisation before being terminated. If a job takes longer than that, we recommend using the _long.sh bash scripts, which allocate 10 days of run time.
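
For illustration, a minimal sketch of what this change amounts to in a Slurm batch script header, assuming the limit is set with an `#SBATCH --time` directive. The job name and the variable names are hypothetical and not taken from the repository scripts:

```shell
#!/bin/bash
# Hypothetical header for a short DAISIE ML job; the job name is illustrative only.
#SBATCH --job-name=daisie_ml_short
#SBATCH --time=05:00:00        # 5 hours (previously 02:00:00)
# A _long.sh script would instead request 10 days, written as:
#   #SBATCH --time=10-00:00:00

# Plain arithmetic to show the change in wall time (assumed values from this PR).
old_hours=2
new_hours=5
new_seconds=$((new_hours * 3600))
extra_hours=$((new_hours - old_hours))
echo "Short jobs now request ${new_hours} h (${new_seconds} s), ${extra_hours} h more than before"
```

Slurm terminates a job once it exceeds the requested `--time`, so the value should be large enough for optimisation to finish but no larger than needed.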

@Neves-P I set this to 5 hours as this should be plenty for DAISIE optimisation to finish (some of my jobs were being cut off by the 2 hour limit). I'm unsure what the slurm algorithm is for prioritising jobs based on time allocation, but I'm hoping 5 hours is still considered a "short job". Let me know if you think we should change this. Assuming all the CI checks pass, I will merge this into develop and then main.

Member

Neves-P commented Dec 11, 2023

@joshwlambert iirc, slurm short jobs atm are <2 days, so 5 hours is perfectly reasonable. Ideally we want to keep them as short as possible regardless of the queue partition they get allocated to, since shorter jobs will have an easier time being scheduled.
Be aware /scratch is currently down (and hence so is Habrok, updates here), so I would hold off on submitting anything until that is resolved.

@Neves-P Neves-P self-requested a review December 11, 2023 10:43
Member

@Neves-P Neves-P left a comment

lgtm, thanks!

@Neves-P Neves-P merged commit 416ced9 into develop Dec 11, 2023
7 checks passed
@Neves-P Neves-P deleted the incre_job_time branch December 11, 2023 10:44
Member

Neves-P commented Dec 11, 2023

Sorry, I jumped the gun and merged, but I think all is fine for now.

@joshwlambert
Collaborator Author

@Neves-P all good, thanks for checking and for the update on Habrok. Any estimate on when it will be back up and running? (I'll keep an eye on the status)

Thanks for merging, one less job for me 😉 🙌
